NIST Post-Quantum Cryptography- A Hardware Evaluation Study

Author: Deepraj Soni
Kanad Basu
Mohammed Nabeel
Ramesh Karri
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 16/05/2019

Experts forecast that quantum computers can break classical cryptographic algorithms. Scientists are developing post-quantum cryptographic (PQC) algorithms that are invulnerable to quantum computer attacks. The National Institute of Standards and Technology (NIST) started a public evaluation process to standardize quantum-resistant public key algorithms. The objective of our study is to provide a hardware comparison of the NIST PQC competition candidates. For this, we use a High-Level Synthesis (HLS) hardware design methodology to map high-level C specifications of selected PQC candidates into both FPGA and ASIC implementations.

Cryptology ePrint Archive

CoFHEE: A Co-processor for Fully Homomorphic Encryption Execution

Author: Ashraf Mohammed
Chielle Eduardo
Gamil Homer
Gebremichael Mizan Abraha
Karri Ramesh
Maniatakos Michail
Nabeel Mohammed
Sanduleanu Mihai
Soni Deepraj
Publication date: 19/04/2022

The migration of computation to the cloud has raised privacy concerns as sensitive data becomes vulnerable to attacks since it needs to be decrypted for processing. Fully Homomorphic Encryption (FHE) mitigates this issue as it enables meaningful computations to be performed directly on encrypted data. Nevertheless, FHE is orders of magnitude slower than unencrypted computation, which hinders its practicality and adoption. Therefore, improving FHE performance is essential for its real-world deployment. In this paper, we present a year-long effort to design, implement, fabricate, and post-silicon validate a hardware accelerator for Fully Homomorphic Encryption dubbed CoFHEE. With a design area of $12\,\mathrm{mm}^2$, CoFHEE aims to improve performance of ciphertext multiplications, the most demanding arithmetic FHE operation, by accelerating several primitive operations on polynomials, such as polynomial additions and subtractions, Hadamard product, and Number Theoretic Transform. CoFHEE supports polynomial degrees of up to $n = 2^{14}$ with a maximum coefficient size of 128 bits, while it is capable of performing ciphertext multiplications entirely on chip for $n \leq 2^{13}$. CoFHEE is fabricated in 55nm CMOS technology and achieves 250 MHz with our custom-built low-power digital PLL design. In addition, our chip includes two communication interfaces to the host machine: UART and SPI. This manuscript presents all steps and design techniques in the ASIC development process, ranging from RTL design to fabrication and validation. We evaluate our chip with performance and power experiments and compare it against state-of-the-art software implementations and other ASIC designs. Developed RTL files are available in an open-source repository.

arXiv.org e-Print Archive

TREBUCHET: Fully Homomorphic Encryption Accelerator for Deep Computation

Author: Badawi Ahmad Al
Canida Kellie
Cousins David Bruce
French Matthew
Gamil Homer
Jacob Ajey
Jaiswal Akhilesh
Maniatakos Michail
Mathew Clynn
Neda Negar
Polyakov Yuriy
Reagen Brandon
Reynwar Benedict
Schmidt Andrew
Soni Deepraj
Publication date: 11/04/2023

Secure computation is of critical importance to not only the DoD, but across financial institutions, healthcare, and anywhere personally identifiable information (PII) is accessed. Traditional security techniques require data to be decrypted before performing any computation. When processed on untrusted systems, the decrypted data is vulnerable to attacks to extract the sensitive information. To address these vulnerabilities, Fully Homomorphic Encryption (FHE) keeps the data encrypted during computation and secures the results, even in these untrusted environments. However, FHE requires a significant amount of computation to perform equivalent unencrypted operations. To be useful, FHE must significantly close the computation gap (within 10x) to make encrypted processing practical. To accomplish this ambitious goal, the TREBUCHET project is leading research and development in FHE processing hardware to accelerate deep computations on encrypted data, as part of the DARPA MTO Data Privacy for Virtual Environments (DPRIVE) program. We accelerate the major secure standardized FHE schemes (BGV, BFV, CKKS, FHEW, etc.) at >=128-bit security while integrating with the open-source PALISADE and OpenFHE libraries currently used in the DoD and in industry. We utilize a novel tile-based chip design with highly parallel ALUs optimized for vectorized 128b modulo arithmetic. The TREBUCHET coprocessor design provides a highly modular, flexible, and extensible FHE accelerator for easy reconfiguration, deployment, integration and application on other hardware form factors, such as System-on-Chip or alternate chip areas.

Comment: 6 pages, 5 figures, 2 tables

arXiv.org e-Print Archive

RPU: The Ring Processing Unit

Author: Ahmad Al Badawi
Andrew Schmidt
Benedict Reynwar
Benjamin Heyman
Brandon Reagen
David Bruce Cousins
Deepraj Soni
Franz Franchetti
Homer Gamil
Kellie Canida
Massoud Pedram
Matthew French
Michail Maniatakos
Mohammed Nabeel Thari Moopan
Naifeng Zhang
Negar Neda
Yuriy Polyakov
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 30/03/2023

Ring-Learning-with-Errors (RLWE) has emerged as the foundation of many important techniques for improving security and privacy, including homomorphic encryption and post-quantum cryptography. While promising, these techniques have received limited use due to their extreme overheads of running on general-purpose machines. In this paper, we present a novel vector Instruction Set Architecture (ISA) and microarchitecture for accelerating the ring-based computations of RLWE. The ISA, named B512, is developed to meet the needs of ring processing workloads while balancing high-performance and general-purpose programming support. Having an ISA rather than fixed hardware facilitates continued software improvement post-fabrication and the ability to support the evolving workloads. We then propose the ring processing unit (RPU), a high-performance, modular implementation of B512. The RPU has native large word modular arithmetic support, capabilities for very wide parallel processing, and a large capacity high-bandwidth scratchpad to meet the needs of ring processing. We address the challenges of programming the RPU using a newly developed SPIRAL backend. A configurable simulator is built to characterize design tradeoffs and quantify performance. The best performing design was implemented in RTL and used to validate simulator performance. In addition to our characterization, we show that an RPU using $20.5\,\mathrm{mm}^2$ of GF 12nm can provide a speedup of 1485x over a CPU running a 64k, 128-bit NTT, a core RLWE workload.

Cryptology ePrint Archive

TREBUCHET: Fully Homomorphic Encryption Accelerator for Deep Computation

Author: Ahmad Al Badawi
Ajey Jacob
Akhilesh Jaiswal
Andrew Schmidt
Benedict Reynwar
Bo Zhang
Brandon Reagen
Clynn Mathew
David Bruce Cousins
Deepraj Soni
Franz Franchetti
Homer Gamil
Jeremy Johnson
Kellie Canida
Massoud Pedram
Matthew French
Michail Maniatakos
Mike Franusich
Naifeng Zhang
Negar Neda
Patrick Brinich
Patrick Broderick
Yuriy Polyakov
Zeming Cheng
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 18/04/2023

Cryptology ePrint Archive

Scaling Up Hardware Accelerator Verification using A-QED with Functional Decomposition

Author: Barrett Clark
Carloni Luca
Chattopadhyay Saranyu
Chen Deming
Cong Jason
Karri Ramesh
Lonsing Florian
Mitra Subhasish
Piccolboni Luca
Soni Deepraj
Trippel Caroline
Wei Peng
Zhang Xiaofan
Zhang Zhiru
Zhou Yuan
Publication date: 17/08/2021

arXiv.org e-Print Archive

reposiTUm